124 research outputs found

    Hierarchical GTM: Constructing localized non-linear projection manifolds in a principled way

    It has been argued that a single two-dimensional visualization plot may not be sufficient to capture all of the interesting aspects of complex data sets, and therefore a hierarchical visualization system is desirable. In this paper we extend an existing locally linear hierarchical visualization system PhiVis [Bishop98a] in several directions: (1) We allow for non-linear projection manifolds. The basic building block is the Generative Topographic Mapping. (2) We introduce a general formulation of hierarchical probabilistic models consisting of local probabilistic models organized in a hierarchical tree. General training equations are derived, regardless of the position of the model in the tree. (3) Using tools from differential geometry we derive expressions for local directional curvatures of the projection manifold. Like PhiVis, our system is statistically principled and is built interactively in a top-down fashion using the EM algorithm. It enables the user to interactively highlight those data in the parent visualization plot which are captured by a child model. We also incorporate into our system a hierarchical, locally selective representation of magnification factors and directional curvatures of the projection manifolds. Such information is important for further refinement of the hierarchical visualization plot, as well as for controlling the amount of regularization imposed on the local models. We demonstrate the principle of the approach on a toy data set and apply our system to two more complex 12- and 19-dimensional data sets

    A new entropy measure based on the Renyi entropy rate using Gaussian kernels

    The concept of entropy rate is well defined in dynamical systems theory, but it is impossible to apply directly to finite real-world data sets. With this in mind, Pincus developed Approximate Entropy (ApEn), which uses ideas from Eckmann and Ruelle to create a regularity measure based on entropy rate that can be used to determine the influence of chaotic behaviour in a real-world signal. However, this measure was found not to be robust, and so an improved formulation known as the Sample Entropy (SampEn) was created by Richman and Moorman to address these issues. We have developed a new, related regularity measure which is not based on the theory provided by Eckmann and Ruelle and which proves to be a better-behaved measure of complexity than the previous measures whilst still retaining a low computational cost
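
    For orientation, the following is a minimal sketch of Sample Entropy, the earlier measure named in the abstract above; it is not the new Renyi-based measure, whose definition the abstract does not give. The function name `sample_entropy`, the defaults `m=2` and `r=0.2`, and the choice to return infinity when no matches are found are assumptions made for illustration.

```python
import numpy as np

def sample_entropy(x, m=2, r=0.2):
    """Sketch of SampEn: -ln(A / B), where B counts pairs of length-m
    templates within tolerance and A counts pairs of length-(m + 1)
    templates within tolerance (Chebyshev distance, self-matches excluded)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    tol = r * x.std()  # tolerance as a fraction of the signal's std (assumed convention)

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = np.array([x[i:i + length] for i in range(n - m)])
        count = 0
        for i in range(len(templates) - 1):
            # Chebyshev distance from template i to every later template.
            dist = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += int(np.sum(dist <= tol))
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined in this case; reported as inf here (assumption)
    return -np.log(a / b)
```

    Under these assumptions, a strictly periodic signal gives a value of zero, while white noise gives a clearly positive value.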

    Delay estimation for multivariate time series

    Most traditional methods for extracting the relationships between two time series are based on cross-correlation. In a non-linear non-stationary environment, these techniques are not sufficient. We show in this paper how to use hidden Markov models (HMMs) to identify the lag (or delay) between different variables for such data. We first present a method using maximum likelihood estimation and propose a simple algorithm which is capable of identifying associations between variables. We also adopt an information-theoretic approach and develop a novel procedure for training HMMs to maximise the mutual information between delayed time series. Both methods are successfully applied to real data. We model the oil drilling process with HMMs and estimate a crucial parameter, namely the lag for return
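
    As a point of reference, the cross-correlation baseline that the abstract above contrasts with can be sketched as follows. This is not the HMM-based method of the paper; the function name `estimate_lag_xcorr` and the parameter `max_lag` are hypothetical, and the sketch assumes two equally long, evenly sampled series.

```python
import numpy as np

def estimate_lag_xcorr(x, y, max_lag):
    """Return the lag (in samples) that maximises the normalised
    cross-correlation between x[t] and y[t + lag].
    A positive result means y is a delayed copy of x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = x - x.mean()
    y = y - y.mean()
    n = len(x)
    best_lag, best_score = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[:n - lag], y[lag:]   # pairs x[t] with y[t + lag]
        else:
            a, b = x[-lag:], y[:n + lag]  # pairs x[t + |lag|] with y[t]
        denom = np.sqrt(np.sum(a ** 2) * np.sum(b ** 2))
        score = np.dot(a, b) / denom if denom > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

    On a linearly related, stationary pair such as white noise and a shifted copy of it, this recovers the shift; the abstract's point is that this kind of estimate is not sufficient for non-linear, non-stationary data.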

    Data visualization with simultaneous feature selection

    Data visualization algorithms and feature selection techniques are both widely used in bioinformatics but as distinct analytical approaches. Until now there has been no method of measuring feature saliency while training a data visualization model. We derive a generative topographic mapping (GTM) based data visualization approach which estimates feature saliency simultaneously with the training of the visualization model. The approach not only provides a better projection by modeling irrelevant features with a separate noise model but also gives feature saliency values which help the user to assess the significance of each feature. We compare the quality of projection obtained using the new approach with the projections from traditional GTM and self-organizing maps (SOM) algorithms. The results obtained on a synthetic and a real-life chemoinformatics dataset demonstrate that the proposed approach successfully identifies feature significance and provides coherent (compact) projections. © 2006 IEEE

    Analysing time series structure with hidden Markov models

    This paper considers the problem of extracting the relationships between two time series in a non-linear non-stationary environment with Hidden Markov Models (HMMs). We describe an algorithm which is capable of identifying associations between variables. The method is applied both to synthetic data and real data. We show that HMMs are capable of modelling the oil drilling process and that they outperform existing methods

    Hierarchical GTM: constructing localized non-linear projection manifolds in a principled way

    It has been argued that a single two-dimensional visualization plot may not be sufficient to capture all of the interesting aspects of complex data sets, and therefore a hierarchical visualization system is desirable. In this paper we extend an existing locally linear hierarchical visualization system PhiVis [Bishop98a] in several directions: (1) We allow for non-linear projection manifolds. The basic building block is the Generative Topographic Mapping (GTM). (2) We introduce a general formulation of hierarchical probabilistic models consisting of local probabilistic models organized in a hierarchical tree. General training equations are derived, regardless of the position of the model in the tree. (3) Using tools from differential geometry we derive expressions for local directional curvatures of the projection manifold. Like PhiVis, our system is statistically principled and is built interactively in a top-down fashion using the EM algorithm. It enables the user to interactively highlight those data in the ancestor visualization plots which are captured by a child model. We also incorporate into our system a hierarchical, locally selective representation of magnification factors and directional curvatures of the projection manifolds. Such information is important for further refinement of the hierarchical visualization plot, as well as for controlling the amount of regularization imposed on the local models. We demonstrate the principle of the approach on a toy data set and apply our system to two more complex 12- and 18-dimensional data sets

    Time delay estimation with hidden Markov models

    Most traditional methods for extracting the relationships between two time series are based on cross-correlation. In a non-linear non-stationary environment, these techniques are not sufficient. We show in this paper how to use hidden Markov models to identify the lag (or delay) between different variables for such data. Adopting an information-theoretic approach, we develop a procedure for training HMMs to maximise the mutual information (MMI) between delayed time series. The method is used to model the oil drilling process. We show that cross-correlation gives no information and that the MMI approach outperforms maximum likelihood

    NEUROSAT: an overview

    This report gives an overview of the work being carried out, as part of the NEUROSAT project, in the Neural Computing Research Group at Aston University. The aim is to give a general review of the work and methods, with reference to other documents which provide the detail. The document is ongoing and will be updated as parts of the project are completed. Thus some of the references are not yet present. In the broadest sense, the Aston part of NEUROSAT is about using neural networks (and other advanced statistical techniques) to extract wind vectors from satellite measurements of ocean surface radar backscatter. The work involves several phases, which are outlined below. A brief summary of the theory and application of satellite scatterometers forms the first section. The next section deals with the forward modelling of the scatterometer data, after which the inverse problem is addressed. Dealiasing (or disambiguation) is discussed, together with proposed solutions. Finally a holistic framework is presented in which the problem can be solved

    Dynamical local models for segmentation and prediction of financial time series

    In the analysis and prediction of many real-world time series, the assumption of stationarity is not valid. A special form of non-stationarity, where the underlying generator switches between (approximately) stationary regimes, seems particularly appropriate for financial markets. We introduce a new model which combines a dynamic switching (controlled by a hidden Markov model) and a non-linear dynamical system. We show how to train this hybrid model in a maximum likelihood approach and evaluate its performance on both synthetic and financial data

    Modelling financial time series with switching state space models

    The deficiencies of stationary models applied to financial time series are well documented. A special form of non-stationarity, where the underlying generator switches between (approximately) stationary regimes, seems particularly appropriate for financial markets. We use a dynamic switching (modelled by a hidden Markov model) combined with a linear dynamical system in a hybrid switching state space model (SSSM) and discuss the practical details of training such models with a variational EM algorithm due to [Ghahramani and Hinton, 1998]. The performance of the SSSM is evaluated on several financial data sets and it is shown to improve on a number of existing benchmark methods